A New Filtering Algorithm for Duplicate Document Based on Concept Analysis

Author

  • Ahmad M. Hasnah
Abstract

Databases and web pages currently contain a huge number of duplicate documents. It is therefore essential to have a filter that can be embedded, for instance, in an information retrieval system such as a search engine, to prevent references to redundant documents from appearing on screen in reply to the user's query. Such a filter saves the user time and increases user satisfaction. In this study, we propose a new algorithm, based on the principle of concept analysis, that can act as a filter for duplicate documents. It can be applied to a collection of documents or to databases, and it reduces their storage space by eliminating redundant documents without losing knowledge. Our experiments show that this algorithm increases the precision of the information retrieval system and improves its performance.
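The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea, not the author's method. It assumes a formal-concept-analysis view in which documents are objects and their terms are attributes, and it treats two documents as duplicates when their term sets are identical (the same reduction that "clarifying" a formal context performs). The names `term_set` and `filter_duplicates` and the tokenization rule are hypothetical choices made for illustration.

```python
import re
from typing import FrozenSet, List, Set


def term_set(text: str) -> FrozenSet[str]:
    """Return the set of lowercase alphanumeric tokens in a document.

    Assumption: this set plays the role of the document's attributes
    in a formal context (documents x terms).
    """
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def filter_duplicates(docs: List[str]) -> List[str]:
    """Keep the first document for each distinct term set, preserving order."""
    seen: Set[FrozenSet[str]] = set()
    kept: List[str] = []
    for text in docs:
        key = term_set(text)
        if key not in seen:  # first document with this attribute set
            seen.add(key)
            kept.append(text)
    return kept


docs = [
    "Concept analysis filters duplicate documents.",
    "concept analysis filters DUPLICATE documents",
    "Search engines rank documents by relevance.",
]
unique_docs = filter_duplicates(docs)
```

In this sketch only exact matches of the term set count as duplicates; near-duplicate detection would need a similarity threshold, which the abstract does not specify.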

Similar resources

Comprehensive Analysis of Dense Point Cloud Filtering Algorithm for Eliminating Non-Ground Features

Point cloud and LiDAR filtering removes non-ground features from a digital surface model (DSM) in order to reach the bare earth and extract a DTM. Various methods have been proposed by different researchers to distinguish between ground and non-ground points in point cloud and LiDAR data. Most fully automated methods share a common disadvantage: they are only effective for a particular type of surf...

Face Recognition Using PCA and Gabor Filter

Face recognition methods based on face structure are unsupervised techniques and produce unfavorable results in the presence of linear changes in images. PCA is a linear transform and a powerful tool for data analysis, but it does not produce good results for face recognition when there are non-linear changes resulting from changes in position, intensity and gesture in th...

A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation

Recommender systems use information retrieval and machine learning techniques to filter information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in recommender systems based on collaborative filtering. In order to improve the accuracy of traditional user-based collaborative filtering techniques under the new-user cold-start problem a...

Optimized computational Afin image algorithm using combination of update coefficients and wavelet packet conversion

Updating Optimal Coefficients and Selected Observations Affine Projection is an effective way to reduce the computational load and power consumption of this algorithm in adaptive filter applications. In addition, the computation required by this algorithm can be reduced by using subbands and applying the concept of set-membership filtering in each subband. Considering these concepts, the fir...

A New Iterative Fuzzy-Based Method for Image Enhancement (RESEARCH NOTE)

This paper presents a new filtering approach based on fuzzy logic that performs well in mixed-noise environments. The filter is mainly based on the idea that each pixel is not allowed to be uniformly fired by each of the fuzzy rules. In the proposed filtering algorithm, the rule membership functions are tuned iteratively in order to preserve the image edges. Several test experiments we...

Publication date: 2006